Document Re-ranking via Wikipedia Articles for Definition/Biography Type Questions
Authors
Abstract
In this paper, we propose a document re-ranking approach for Chinese information retrieval for question answering (IR4QA) systems that handle definition- or biography-type questions. The approach uses the Wikipedia article related to a given question to re-order the initially retrieved documents and thereby improve the precision of the top retrieved documents. First, we compute the similarity between each document in the initial retrieved results and the related Wikipedia article. Second, we cluster the documents with the K-Means method and compute the similarity between each cluster centroid and the Wikipedia article. We then integrate these two similarities with the initial ranking score into a final similarity value and re-rank the documents in descending order of this value. Experimental results demonstrate that this approach effectively improves the precision of the top-ranked documents.
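The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the three scores are integrated, which similarity measure is used, or how K-Means is initialized, so everything specific here is an assumption: whitespace tokenization with raw term frequencies, cosine similarity, K-Means seeded with the first k documents, and a linear combination with hypothetical weights `alpha`, `beta`, and `gamma`. The function names and the toy English documents are illustrative and do not come from the paper.

```python
import math
from collections import Counter

def tf_vector(text):
    # Bag-of-words term frequencies; whitespace tokenization is an assumption.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-weight mappings.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def mean_vector(vectors):
    # Centroid of a non-empty list of term-frequency vectors.
    total = Counter()
    for v in vectors:
        total.update(v)
    return {t: w / len(vectors) for t, w in total.items()}

def kmeans(vectors, k, iterations=10):
    # Tiny K-Means using cosine similarity; seeding with the first k
    # vectors is an assumption made to keep the sketch deterministic.
    centroids = [dict(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [
            max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels, centroids

def rerank(docs, initial_scores, wiki_article, k=2,
           alpha=0.4, beta=0.3, gamma=0.3):
    # alpha, beta, gamma are hypothetical weights, not values from the paper.
    vectors = [tf_vector(d) for d in docs]
    wiki = tf_vector(wiki_article)
    labels, centroids = kmeans(vectors, k)
    centroid_sims = [cosine(c, wiki) for c in centroids]
    top = max(initial_scores) or 1.0  # normalize initial scores to [0, 1]
    scored = []
    for i, v in enumerate(vectors):
        final = (alpha * initial_scores[i] / top   # initial ranking score
                 + beta * cosine(v, wiki)          # document vs. Wikipedia
                 + gamma * centroid_sims[labels[i]])  # centroid vs. Wikipedia
        scored.append((final, i))
    scored.sort(key=lambda p: (-p[0], p[1]))  # descending by final score
    return [i for _, i in scored]

docs = [
    "einstein physicist relativity theory",
    "einstein bagel shop menu",
    "physicist relativity nobel prize einstein",
    "bagel shop opening hours",
]
initial_scores = [0.5, 0.9, 0.4, 0.3]
wiki = ("albert einstein was a physicist who developed the theory of "
        "relativity and won the nobel prize")
order = rerank(docs, initial_scores, wiki)
print(order)
```

In this toy run, the off-topic document that the initial retrieval ranked first (index 1) drops below the two documents that overlap with the Wikipedia text, which is the effect the abstract describes. A real Chinese IR4QA setting would need proper word segmentation and term weighting in place of the whitespace split used here.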
Similar resources
Highlighting Entanglement of Cultures via Ranking of Multilingual Wikipedia Articles
How do different cultures evaluate a person? Is a person who is important in one culture also important in another? We address these questions via ranking of multilingual Wikipedia articles. With three ranking algorithms based on the network structure of Wikipedia, we assign a ranking to all articles in 9 multilingual editions of Wikipedia and investigate the general ranking structure of PageRank, Ch...
Extracting and Ranking Question-Focused Terms Using the Titles of Wikipedia Articles
At the NTCIR-6 CLQA (Cross-Language Question Answering) task, we participated in the Chinese-Chinese (C-C) and English-Chinese (E-C) QA (Question Answering) subtasks. Without employing question type classification, we proposed a new resource, Wikipedia, to assist in extracting and ranking Question-Focused terms. We regarded the titles of Wikipedia articles as a multilingual noun-phrase corpus w...
KISTI at TREC 2014 Clinical Decision Support Track: Concept-based Document Re-ranking to Biomedical Information Retrieval
With fast development of medical information systems and software, clinical decision support (CDS) systems continue to develop new methods to deal with diverse information coming from heterogeneous sources such as a large volume of electronic medical records (EMRs), patient genomic data, existing genomic pharmaceutical databases, curated disease-specific databases, peer-reviewed research, etc. ...
Comparative Evaluation of Link-Based Approaches for Candidate Ranking in Link-to-Wikipedia Systems
In recent years, the task of automatically linking pieces of text (anchors) mentioned in a document to Wikipedia articles that represent the meaning of these anchors has received extensive research attention. Typically, link-to-Wikipedia systems try to find a set of Wikipedia articles that are candidates to represent the meaning of the anchor and, later, rank these candidates to select the most...
Ranking Automatically Generated Questions as a Shared Task
We propose a shared task for question generation: the ranking of reading comprehension questions about Wikipedia articles generated by a base overgenerating system. This task focuses on domain-general issues in question generation and invites a variety of approaches, and also permits semi-automatic evaluation. We describe an initial system we developed for this task, and an annotation scheme us...
Publication date: 2009